Tashkent
- Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
Kahardipraja, Patrick, Achtibat, Reduan, Wiegand, Thomas, Samek, Wojciech, Lapuschkin, Sebastian
Large language models can exploit in-context learning to access external knowledge beyond their training data through retrieval-augmentation. While promising, the inner workings of this mechanism remain unclear. In this work, we shed light on the mechanism of in-context retrieval augmentation for question answering by viewing a prompt as a composition of informational components. We propose an attribution-based method to identify specialized attention heads, revealing in-context heads that comprehend instructions and retrieve relevant contextual information, and parametric heads that store entities' relational knowledge. To better understand their roles, we extract function vectors and modify their attention weights to show how they can influence the answer generation process. Finally, we leverage the gained insights to trace the sources of knowledge used during inference, paving the way towards safer and more transparent language models.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (25 more...)
- Leisure & Entertainment (1.00)
- Media > Music (0.67)
- Media > Television (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
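The head-attribution procedure described in the Atlas abstract above can be illustrated with a toy example. The scores below are random stand-ins for real attribution values (e.g. gradient-times-activation on the answer token); the layer and head counts are arbitrary, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in attribution scores for each attention head's contribution to
# the answer token; real values would come from running an attribution
# method (e.g. gradient-times-activation) on the model.
n_layers, n_heads = 4, 8
attributions = rng.normal(size=(n_layers, n_heads))

# Rank heads by attribution magnitude and keep the top-k as candidate
# in-context / parametric heads for closer inspection.
k = 5
order = np.argsort(np.abs(attributions).ravel())[::-1][:k]
heads = [(int(i) // n_heads, int(i) % n_heads) for i in order]
print(heads)  # [(layer, head), ...] sorted by attribution magnitude
```

The same ranking, applied per informational component of the prompt (instruction, retrieved context, question), is what lets the paper separate in-context heads from parametric ones.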
Filling the Gap for Uzbek: Creating Translation Resources for Southern Uzbek
Mamasaidov, Mukhammadsaid, Aral, Azizullah, Shopulatov, Abror, Inomjonov, Mironshoh
Southern Uzbek (uzs) is a Turkic language variety spoken by around 5 million people in Afghanistan and differs significantly from Northern Uzbek (uzn) in phonology, lexicon, and orthography. Despite the large number of speakers, Southern Uzbek is underrepresented in natural language processing. We present new resources for Southern Uzbek machine translation, including a 997-sentence FLORES+ dev set, 39,994 parallel sentences from dictionary, literary, and web sources, and a fine-tuned NLLB-200 model (lutfiy). We also propose a post-processing method for restoring Arabic-script half-space characters, which improves handling of morphological boundaries. All datasets, models, and tools are released publicly to support future work on Southern Uzbek and other low-resource languages.
- Asia > Uzbekistan > Toshkent Shahri > Tashkent (0.05)
- Asia > Afghanistan > Kabul Province > Kabul (0.05)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (4 more...)
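The half-space restoration mentioned in the abstract above can be sketched as a rule-based post-processor that replaces a full space before known suffixes with a zero-width non-joiner (ZWNJ, the Arabic-script "half-space"). The suffix list here is a hypothetical stand-in; the paper's actual morphological rules are not shown.

```python
import re

# Hypothetical suffix list (e.g. plural/case endings in Arabic script);
# the paper's actual rule set will differ.
SUFFIXES = ["لر", "نی", "دن"]

HALF_SPACE = "\u200c"  # ZWNJ, rendered as a non-joining "half-space"

def restore_half_space(text: str) -> str:
    """Replace a full space before a known suffix with a half-space,
    marking the morphological boundary without breaking the word."""
    for suf in SUFFIXES:
        text = re.sub(rf" (?={re.escape(suf)}\b)", HALF_SPACE, text)
    return text

print(restore_half_space("کتاب لر"))  # space before the suffix becomes ZWNJ
```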
Enhancement of Quantum Semi-Supervised Learning via Improved Laplacian and Poisson Methods
Gholipour, Hamed, Bozorgnia, Farid, Mohammadigheymasi, Hamzeh, Hambarde, Kailash, Mancilla, Javier, Proenca, Hugo, Neves, Joao, Challenger, Moharram
This paper develops a hybrid quantum approach for graph-based semi-supervised learning to enhance performance in scenarios where labeled data is scarce. We introduce two enhanced quantum models, the Improved Laplacian Quantum Semi-Supervised Learning (ILQSSL) and the Improved Poisson Quantum Semi-Supervised Learning (IPQSSL), that incorporate advanced label propagation strategies within variational quantum circuits. These models utilize QR decomposition to embed graph structure directly into quantum states, thereby enabling more effective learning in low-label settings. We validate our methods across four benchmark datasets -- Iris, Wine, Heart Disease, and German Credit Card -- and show that both ILQSSL and IPQSSL consistently outperform leading classical semi-supervised learning algorithms, particularly under limited supervision. Beyond standard performance metrics, we examine the effect of circuit depth and qubit count on learning quality by analyzing entanglement entropy and Randomized Benchmarking (RB). Our results suggest that while some level of entanglement improves the model's ability to generalize, increased circuit complexity may introduce noise that undermines performance on current quantum hardware. Overall, the study highlights the potential of quantum-enhanced models for semi-supervised learning, offering practical insights into how quantum circuits can be designed to balance expressivity and stability. These findings support the role of quantum machine learning in advancing data-efficient classification, especially in applications constrained by label availability and hardware limitations.
- Europe > Portugal (0.04)
- Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
- Africa > Mozambique > Sofala Province > Beira (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
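The label propagation that the quantum models above build on can be shown in its classical form. This is the standard graph-Laplacian baseline, not the quantum circuit: the paper's ILQSSL/IPQSSL variants embed the graph into quantum states via QR decomposition instead.

```python
import numpy as np

def laplacian_propagation(W, Y, labeled, alpha=0.99, iters=100):
    """Classical label propagation on a similarity graph W (n, n) with
    one-hot targets Y (n, c); labels diffuse from the labeled nodes."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # symmetrically normalized W
    F = np.zeros_like(Y, dtype=float)
    Ymask = np.where(labeled[:, None], Y, 0.0)
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Ymask
    return F.argmax(axis=1)

# Two 2-node clusters, one labeled node per cluster.
W = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = np.eye(2)[[0, 0, 1, 1]]                  # rows for unlabeled nodes are ignored
labeled = np.array([True, False, True, False])
print(laplacian_propagation(W, Y, labeled))  # → [0 0 1 1]
```

Labels spread along graph edges to the unlabeled nodes, which is why such methods work well when supervision is scarce.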
ADMC: Attention-based Diffusion Model for Missing Modalities Feature Completion
Zhang, Wei, Chen, Juan, Wang, Yanbo J., Zhu, En, Yang, Xuan, Wang, Yiduo
Multimodal emotion and intent recognition is essential for automated human-computer interaction. It aims to analyze users' speech, text, and visual information to predict their emotions or intent. A significant challenge is missing modalities caused by sensor malfunctions or incomplete data. Traditional methods that attempt to reconstruct missing information often suffer from over-coupling and imprecise generation processes, leading to suboptimal outcomes. To address these issues, we introduce an Attention-based Diffusion model for Missing Modalities feature Completion (ADMC). Our framework independently trains feature extraction networks for each modality, preserving their unique characteristics and avoiding over-coupling. The Attention-based Diffusion Network (ADN) generates missing modality features that closely align with the authentic multimodal distribution, enhancing performance across all missing-modality scenarios. Moreover, ADN's cross-modal generation offers improved recognition even in full-modality contexts. Our approach achieves state-of-the-art results on the IEMOCAP and MIntRec benchmarks, demonstrating its effectiveness in both missing and complete modality scenarios.
- Asia > Uzbekistan > Toshkent Shahri > Tashkent (0.04)
- Asia > China > Hunan Province > Changsha (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (2 more...)
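The cross-modal conditioning at the heart of ADMC can be sketched as a single attention step: a query standing in for the missing modality attends over the features of the available modalities. In the real model this sits inside a trained diffusion denoiser (the ADN); the weights below are untrained stand-ins.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cross_modal_attention(query, keys, values):
    """Attend from a missing-modality query vector (d,) over the
    available modalities' features (n, d) to produce a completed
    feature as a weighted mix of what is present."""
    scores = keys @ query / np.sqrt(len(query))
    return softmax(scores) @ values

# Toy example: recover a "text" feature from four available features.
query = np.ones(4)
keys = np.eye(4)
values = np.arange(16.0).reshape(4, 4)
out = cross_modal_attention(query, keys, values)
print(out)  # a weighted mix of the available features
```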
MLlm-DR: Towards Explainable Depression Recognition with MultiModal Large Language Models
Zhang, Wei, Chen, Juan, Zhu, En, Cheng, Wenhong, Li, YunPeng, Wang, Yanbo J.
Automated depression diagnosis aims to analyze multimodal information from interview videos to predict participants' depression scores. Previous studies often lack clear explanations of how these scores were determined, limiting their adoption in clinical practice. While the advent of LLMs provides a possible pathway for explainable depression diagnosis, current LLMs capable of processing multimodal data lack training on interview data, resulting in poor diagnostic performance when used directly. In this paper, we propose a novel multimodal large language model (MLlm-DR) that can understand multimodal information inputs and supports explainable depression diagnosis. MLlm-DR integrates a smaller LLM and a lightweight query module (LQ-former). Specifically, the smaller LLM is designed to generate depression scores and corresponding evaluation rationales. To enhance its logical reasoning for domain-specific tasks while maintaining practicality, we constructed a robust training dataset to fine-tune it. Meanwhile, the LQ-former captures depression-related features from speech and visual data, helping the model process multimodal information and achieve comprehensive depression diagnosis. Our approach achieves state-of-the-art results on two interview-based benchmark datasets, CMDC and E-DAIC-WOZ, demonstrating its effectiveness and superiority.
- Asia > China > Shanghai > Shanghai (0.05)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Asia > China > Hunan Province > Changsha (0.04)
- (5 more...)
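The LQ-former described above follows the Q-Former pattern: a small, fixed set of learnable query vectors attends over a variable-length speech/visual feature sequence and emits a fixed number of tokens for the LLM. A minimal sketch, with untrained toy weights standing in for the learned queries:

```python
import numpy as np

def lq_pool(queries, feats):
    """Query pooling: m query vectors (m, d) attend over a feature
    sequence (n, d) and return m fixed-size tokens, regardless of n."""
    scores = queries @ feats.T / np.sqrt(queries.shape[1])
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn = e / e.sum(axis=-1, keepdims=True)
    return attn @ feats

tokens = lq_pool(np.zeros((2, 4)), np.arange(8.0).reshape(2, 4))
print(tokens.shape)  # (2, 4): fixed-size output regardless of input length
```

The fixed output size is what lets arbitrarily long interview recordings be fed to an LLM with a bounded token budget.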
Ising Models with Hidden Markov Structure: Applications to Probabilistic Inference in Machine Learning
Herrera, F., Rozikov, U. A., Velasco, M. V.
In this paper, we investigate tree-indexed Markov chains (Gibbs measures) defined by a Hamiltonian that couples two Ising layers: hidden spins \(s(x) \in \{\pm 1\}\) and observed spins \(\sigma(x) \in \{\pm 1\}\) on a Cayley tree. The Hamiltonian incorporates Ising interactions within each layer and site-wise emission couplings between layers, extending hidden Markov models to a bilayer Markov random field. Specifically, we explore translation-invariant Gibbs measures (TIGM) of this Hamiltonian on Cayley trees. Under certain explicit conditions on the model's parameters, we demonstrate that there can be up to three distinct TIGMs. Each of these measures represents an equilibrium state of the spin system. These measures provide a structured approach to inference on hierarchical data in machine learning. They have practical applications in tasks such as denoising, weakly supervised learning, and anomaly detection. The Cayley tree structure is particularly advantageous for exact inference due to its tractability.
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- Asia > Uzbekistan > Toshkent Shahri > Tashkent (0.04)
- Asia > Singapore (0.04)
- (3 more...)
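From the description above, the bilayer Hamiltonian plausibly takes the following form; the coupling constants and sign conventions here are assumptions for illustration, and the paper defines them precisely:

```latex
H(s,\sigma) \;=\; -J \sum_{\langle x,y\rangle} s(x)\,s(y)
\;-\; K \sum_{\langle x,y\rangle} \sigma(x)\,\sigma(y)
\;-\; h \sum_{x} s(x)\,\sigma(x),
```

where the first two sums run over nearest-neighbour edges of the Cayley tree (the intra-layer Ising interactions) and the last sum runs over vertices, coupling each hidden spin to its observed spin (the emission term that generalizes hidden Markov models).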
Non-Intrusive Load Monitoring Based on Image Load Signatures and Continual Learning
Non-Intrusive Load Monitoring (NILM) identifies the operating status and energy consumption of each electrical device in a circuit by analyzing the electrical signals at the bus, which is of great significance for smart power management. However, complex and changeable load combinations and application environments leave traditional NILM methods with poor feature robustness and insufficient model generalization. To this end, this paper proposes a new non-intrusive load monitoring method that integrates "image load signatures" and continual learning. The method converts multi-dimensional power signals such as current, voltage, and power factor into visual image load signatures and uses deep convolutional neural networks to identify and classify multiple devices; self-supervised pre-training is introduced to improve feature generalization, and continual online learning strategies are used to overcome catastrophic forgetting and adapt to newly appearing loads. Extensive experiments on high-sampling-rate load datasets, with comparisons against a variety of existing methods and model variants, show that the proposed method achieves significant improvements in recognition accuracy.
- Asia > Uzbekistan > Toshkent Shahri > Tashkent (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- Energy (0.67)
- Electrical Industrial Apparatus (0.50)
- Education > Educational Setting (0.34)
Enhancing Large Language Models with Neurosymbolic Reasoning for Multilingual Tasks
Nezhad, Sina Bagheri, Agrawal, Ameeta
Large language models (LLMs) often struggle to perform multi-target reasoning in long-context scenarios where relevant information is scattered across extensive documents. To address this challenge, we introduce NeuroSymbolic Augmented Reasoning (NSAR), which combines the benefits of neural and symbolic reasoning during inference. NSAR explicitly extracts symbolic facts from text and generates executable Python code to handle complex reasoning steps. Through extensive experiments across seven languages and diverse context lengths, we demonstrate that NSAR significantly outperforms both a vanilla RAG baseline and advanced prompting strategies in accurately identifying and synthesizing multiple pieces of information. Our results highlight the effectiveness of combining explicit symbolic operations with neural inference for robust, interpretable, and scalable reasoning in multilingual settings.
- Asia > India > Maharashtra > Mumbai (0.04)
- Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (19 more...)
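The NSAR pipeline above has two stages: extract symbolic facts from scattered context, then run executable code over them instead of asking the model to juggle everything in its head. A toy sketch, with a regex standing in for the LLM extraction step and a hypothetical "find the cheapest item" task:

```python
import re

def extract_facts(text):
    """Toy symbolic extraction: pull 'X costs N' facts from text.
    (NSAR uses an LLM for this step; the regex is a stand-in.)"""
    return {m[0]: int(m[1]) for m in re.findall(r"(\w+) costs (\d+)", text)}

def reason(facts):
    # The symbolic step as executable code: exact, auditable, and
    # independent of how far apart the facts sat in the context.
    return min(facts, key=facts.get)

ctx = "A pen costs 3. Unrelated filler text. A book costs 12. A mug costs 7."
facts = extract_facts(ctx)
print(reason(facts))  # → pen
```

Because the comparison runs as code rather than inside the model, multi-target reasoning stays exact even when the relevant facts are scattered across a long document.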